AOEHO: A New Hybrid Data Replication Method in Fog Computing for IoT Application

Recently, the concept of the internet of things and its services has emerged alongside cloud computing. Cloud computing is a modern technology for processing big data. In fog computing, a central problem is selecting and placing replicas across nodes. Previous studies focused on original swarm intelligence and mathematical models; we therefore propose a novel hybrid method based on two modern metaheuristic algorithms. This paper combines the Aquila Optimizer (AO) algorithm with elephant herding optimization (EHO) to solve dynamic data replication problems in the fog computing environment. In the proposed method, we present a set of objectives that determine data transmission paths, choose the least cost path, reduce network bottlenecks and bandwidth consumption, balance load, and speed up data transfer between nodes in cloud computing. The hybrid method, AOEHO, finds the optimal and least expensive path, determines the best replication via cloud computing, and determines the optimal nodes for selecting and placing data replicas near users. Moreover, we developed a multi-objective optimization based on the proposed AOEHO to decrease bandwidth and enhance load balancing and cloud throughput. The proposed method is evaluated on data replication using seven criteria: data replication access, distance, costs, availability, SBER, popularity, and the Floyd algorithm. The experimental results show that the proposed AOEHO strategy outperforms other algorithms in terms of bandwidth, distance, load balancing, data transmission, and least cost path.


Introduction
Nowadays, cloud computing has become an essential part of the operations of companies and large organizations and of big data processing. The internet of things uses cloud computing to transfer data through sensors in cloud environments [1][2][3][4][5][6][7]. Cloud computing provides many services to users on a pay-per-use basis. Cloud computing is also used in farms, networks, factories, companies, and other industrial environments [8][9][10][11][12][13][14][15]. The internet of things is also used for data transfer in many large and medium companies, the military, the police, and medicine. Cloud computing consists of infrastructure as a service (IaaS), platform as a service (PaaS), and, as the top layer, software as a service (SaaS) [16][17][18][19][20][21][22]. In addition, cloud computing environments are cheaper.

The main contributions of this paper are as follows:

• Designing a discrete AOEHO strategy for solving the dynamic data replication problem in a fog computing environment.

• Improving a swarm intelligence technique by hybridizing the aquila optimizer (AO) algorithm with elephant herding optimization (EHO) for solving dynamic data replication problems in the fog computing environment.

• Developing a multi-objective optimization based on the proposed AOEHO to decrease bandwidth and enhance load balancing and cloud throughput. It evaluates data replication using seven criteria: data replication access, distance, costs, availability, SBER, popularity, and the Floyd algorithm.

• Showing experimentally that the AOEHO strategy outperforms other algorithms in terms of bandwidth, distance, load balancing, data transmission, and least cost path.
The rest of this paper is organized as follows. Section 2 introduces related work. Section 3 presents the proposed strategy. Section 4 presents the evaluation results. Finally, Section 5 presents the conclusion and future work.

Related Work
Many studies have investigated data replication strategies in the cloud. To create a schema for data replication among nodes while preserving privacy and secrecy in fog computing, K. Sarwar et al. [31] suggested two cross-node replication privacy techniques, implemented for data security, reliability, and authentication. Compared to other algorithms, the suggested approach fared better regarding memory usage, cost, confidentiality, and privacy. D. Chen et al. suggested the first decentralized system, BOSSA, which works with all parties on blockchain platforms and provides data retrieval and repeatability. BOSSA also uses privacy-enhancing technology to stop decentralized peers, such as blockchain nodes, from drawing personal conclusions from public data. Using intelligent nodes on the Ethereum blockchain, the authors constructed a BOSSA-based prototype and presented a security analysis in the context of integrity, privacy, and reliability. Their evaluation shows that the suggestion is workable [32].
For task scheduling, C. Li et al. [33] proposed a time, cost, and energy optimization plan based on the Lagrangian relaxation method. This technique considers load balancing, storage, data dependency, data transfer, time, cost, and bandwidth to achieve the shortest data transmission time between nodes. A fault-tolerant task scheduling approach directed toward cloudlets was also proposed. The experiments supported the effectiveness of the suggested approach in selecting the best site and transferring data using cloud computing.
T. Shi et al. presented a novel strategy called multi-cloud application deployment (MCApp). MCApp combines domain-specific big-node search with iterative mixed integer linear programming to streamline the deployment of data replication and user requests. Trials using actual data and datasets show that MCApp performs noticeably better than other algorithms [34]. A. Majed et al. developed a hybrid strategy for peer-to-peer data replication in cloud environments. It efficiently selects the network's best nodes. Additionally, it chooses and positions the most accessible and most frequently used user data files. The experiments revealed enhanced network functionality and decreased user waiting time [35].
C. Li et al. suggested an approach based on the Lagrangian relaxation technique for optimal data replication among nodes in cloud computing. The approach balances transmission time, bandwidth, and load, and it uses the Floyd algorithm to save cost and bandwidth. The results demonstrated the suggested algorithm's superiority to competing algorithms [36].
A. Khelifa et al. introduced a plan for regular and dynamic data replication in cloud computing. The proposed approach aims to decrease the time needed to serve user requests, achieve load balancing, decrease waiting times, and expedite data access. It also speeds up data transmission and transfer in cloud computing. In addition, a fuzzy logic approach was implemented for selecting and placing data replicas among nodes. The suggested algorithm performed better than other algorithms [37].
B. Mohammadi et al. presented a cloud computing algorithm for selecting and configuring data replication among nodes. It reduces user wait times by using a hybrid of fuzzy logic and ant colony optimization to identify the most appropriate and effective nodes for placing data replicas. The suggested algorithm fared better than the competition [38]. An overview of the given studies is presented in Table 1.

Table 1. Overview of the related studies ([34]: high availability, high cost, high performance, high response time).

Figure 1 shows geographically dispersed nodes containing a host, virtual machines (VMs), memory, a CPU, a block, etc. The proposed system comprises file replication, cloudlets, files, blocks, DCs, hosts, VMs, brokers, replica management, and a replica catalog. In order to complete specific activities, such as accessing data replicas across nodes or remote geographic locations, the broker acts as a mediator between the user and the DCs. The different DCs are filled with many files, f1, f2, ..., fn, which are randomly dispersed to the other DCs in the following stages. The suggested system is split into two components: selecting and placing dynamic data replicas across the nodes, and accessing data using the quickest and least expensive route. To place data replicas close to users and select the quickest and best route to the ideal node, data replication is determined by the files that users access most often and most easily over time. We combined MOO with HHO to achieve the shortest resource and lowest cost path among nodes in order to maximize cloud computing across nodes.

This section describes the strategy for the selection and placement of data replication using a hybrid of the aquila optimizer (AO) algorithm and elephant herding optimization (EHO) in fog computing. We assume that our proposed system is composed of a certain number of fog nodes (consisting of region points of presence (RPOPs), local points of presence (LPOPs), and gateways), data centers (DCs), and internet of things components and equipment, such as RFID and sensors.

Proposed System and Structure
In the proposed strategy, region points of presence (RPOPs) and local points of presence (LPOPs) cover different geographical areas. Services are deployed on data nodes, on the proposed fog nodes, or on IoT sensors. The fog broker, located on the fog nodes layer, is a crucial part of the suggested method. It consists of three components: a task manager, a resource monitoring service, and a task scheduler. Our IoT-based dynamic data replication approach is essential to the fog computing system. A series of configurations is needed to transport data over fog computing, including selecting and placing data replicas across nodes. We presume that our suggested strategy includes a specific number of fog nodes, IoT services, and DCs. We organized the suggested method across various geographical areas to select and place data replicas between nodes in fog computing. Any DCs, fog nodes, or IoT sensors can be used to distribute services. The AO algorithm is combined with the EHO algorithm to transfer data via DCs along the least cost path with minimum bandwidth. MOO with the Floyd algorithm was also used to reduce cost and bandwidth and to speed up data transmission in the fog cloud.

Aquila Optimizer (AO)
The aquila, a bird of prey, is described as second only to humans in intelligence because of its remarkable hunting ability, which exceeds that of other animals. The following steps explain the aquila algorithm and how it is used in our proposed algorithm [39].

• Step 1: Expanded exploration. The aquila rises high, surveys the search area on a large scale, and then attacks the prey vertically. The equation can be represented as follows:

$$X_1(t+1) = X_{best}(t) \times \left(1 - \frac{t}{T}\right) + \left(X_M(t) - X_{best}(t) \times R\right) \tag{1}$$

$X_M(t)$ can be calculated as follows:

$$X_M(t) = \frac{1}{N}\sum_{i=1}^{N} X_i(t) \tag{2}$$

where $X_{best}(t)$ is the best position, $X_M(t)$ is the average position, $t$ and $T$ are the current iteration and the maximum number of iterations, $N$ is the population size, and $R$ is a random number between 0 and 1.

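• Step 2: Narrowed exploration. The aquila circles around the prey within the specified area and attacks it with a short glide. This is the most common way for the aquila to obtain and attack prey. The equation can be represented as follows:

$$X_2(t+1) = X_{best}(t) \times Levy(D) + X_R(t) + (y - x) \times rand \tag{3}$$

$$Levy(D) = s \times \frac{u \times \sigma}{|v|^{1/\beta}} \tag{4}$$

$$\sigma = \frac{\Gamma(1+\beta) \times \sin\left(\frac{\pi\beta}{2}\right)}{\Gamma\left(\frac{1+\beta}{2}\right) \times \beta \times 2^{\frac{\beta-1}{2}}} \tag{5}$$

where $X_R(t)$ is a random position in the aquila population, $D$ is the dimension size, $s$ and $\beta$ are constant values equal to 0.01 and 1.5, $u$ and $v$ are random numbers between 0 and 1, and $y$ and $x$ present the spiral shape in the search. They can be calculated as follows:

$$y = r \times \cos(\theta) \tag{6}$$

$$x = r \times \sin(\theta) \tag{7}$$

$$r = r_1 + U \times D_1 \tag{8}$$

$$\theta = -\omega \times D_1 + \theta_1 \tag{9}$$

$$\theta_1 = \frac{3\pi}{2} \tag{10}$$

where $r_1$ is the number of search cycles, between 1 and 20, $D_1$ is composed of integer numbers from 1 to the dimension size ($D$), $U$ is a small fixed value (0.00565 in the original AO formulation [39]), and $\omega$ equals 0.005.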

• Step 3: Expanded exploitation. The aquila exploits the selected foraging area: once the prey area has been located, it descends vertically and attacks. The behavior is represented as follows:

$$X_3(t+1) = \left(X_{best}(t) - X_M(t)\right) \times \alpha - rand + \left((UB - LB) \times rand + LB\right) \times \delta \tag{11}$$

where $\alpha$ and $\delta$ are the exploitation adjustment parameters, fixed to 0.1, and $UB_j$ and $LB_j$ are the upper and lower bounds of the problem.

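• Step 4: Narrowed exploitation. The aquila chases the prey along its escape path and attacks it on the ground. The equation can be represented as follows:

$$X_4(t+1) = QF(t) \times X_{best}(t) - \left(G_1 \times X(t) \times rand\right) - G_2 \times Levy(D) + rand \times G_1 \tag{12}$$

$$QF(t) = t^{\frac{2 \times rand - 1}{(1-T)^2}} \tag{13}$$

$$G_1 = 2 \times rand - 1 \tag{14}$$

$$G_2 = 2 \times \left(1 - \frac{t}{T}\right) \tag{15}$$

where $G_1$ denotes the movement parameter of the aquila, a random number in [-1, 1], $G_2$ denotes the flight slope when chasing prey, $X(t)$ is the current position, and $QF(t)$ represents the quality function value.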
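The four position updates described above can be sketched in Python as follows. This is a minimal illustration that follows the commonly cited AO formulation [39], not the authors' implementation. The function names, the default `r1=10`, the constant `U=0.00565`, and the choice to draw one fresh random number wherever a formula uses a random term are assumptions made for the example.

```python
import math
import random

def levy_flight(dim, rng, s=0.01, beta=1.5):
    """Levy(D): one step per dimension, s * u * sigma / |v|^(1/beta)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)))
    # u is drawn from [0, 1); v is drawn from (0, 1] so the division is safe.
    return [s * rng.random() * sigma / (1.0 - rng.random()) ** (1 / beta)
            for _ in range(dim)]

def mean_position(population):
    """X_M(t): component-wise mean of all N current solutions."""
    n = len(population)
    return [sum(column) / n for column in zip(*population)]

def expanded_exploration(x_best, x_mean, t, t_max, rng):
    """Step 1: high soar with a vertical stoop."""
    r = rng.random()
    return [b * (1 - t / t_max) + (m - b * r) for b, m in zip(x_best, x_mean)]

def narrowed_exploration(x_best, x_rand, rng, r1=10, U=0.00565, omega=0.005):
    """Step 2: spiral (contour) flight around the prey with a Levy step."""
    levy = levy_flight(len(x_best), rng)
    theta1 = 1.5 * math.pi
    new_position = []
    for d, (b, xr) in enumerate(zip(x_best, x_rand)):
        d1 = d + 1                      # D1 runs from 1 to D
        r = r1 + U * d1
        theta = -omega * d1 + theta1
        y, x = r * math.cos(theta), r * math.sin(theta)
        new_position.append(b * levy[d] + xr + (y - x) * rng.random())
    return new_position

def expanded_exploitation(x_best, x_mean, lb, ub, rng, alpha=0.1, delta=0.1):
    """Step 3: low flight with a slow descent toward the prey area."""
    return [(b - m) * alpha - rng.random() + ((hi - lo) * rng.random() + lo) * delta
            for b, m, lo, hi in zip(x_best, x_mean, lb, ub)]

def narrowed_exploitation(x, x_best, t, t_max, rng):
    """Step 4: walk and grab; t is counted from 1 and t_max must exceed 1."""
    qf = t ** ((2 * rng.random() - 1) / (1 - t_max) ** 2)   # quality function
    g1 = 2 * rng.random() - 1                               # movement parameter
    g2 = 2 * (1 - t / t_max)                                # flight slope
    levy = levy_flight(len(x), rng)
    return [qf * b - g1 * xi * rng.random() - g2 * lv + rng.random() * g1
            for xi, b, lv in zip(x, x_best, levy)]

# Usage: one candidate per step for a toy two-dimensional population.
rng = random.Random(42)
population = [[0.0, 0.0], [2.0, 4.0], [1.0, 1.0]]
x_best, x_mean = population[2], mean_position(population)
x1 = expanded_exploration(x_best, x_mean, t=1, t_max=100, rng=rng)
x2 = narrowed_exploration(x_best, population[0], rng)
x3 = expanded_exploitation(x_best, x_mean, [-5.0, -5.0], [5.0, 5.0], rng)
x4 = narrowed_exploitation(population[1], x_best, t=1, t_max=100, rng=rng)
```

Passing the random generator explicitly keeps each update reproducible, which helps when comparing runs of the hybrid strategy.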

Clan-Updating Operator
Elephants live in clans, and each clan is led by a matriarch [40,41]. The position of each elephant in a clan is updated as follows:

$$x_{new,ci,j} = x_{ci,j} + \alpha \times \left(x_{best,ci} - x_{ci,j}\right) \times r \tag{16}$$

where $x_{new,ci,j}$ and $x_{ci,j}$ present the new and old positions of elephant $j$ in clan $ci$. $x_{best,ci}$ is the matriarch, representing the clan's best elephant. $\alpha$ is in the range [0, 1], and $r$ is in the range [0, 1]. The best elephant is updated as follows:

$$x_{new,ci,j} = \beta \times x_{center,ci} \tag{17}$$

where $x_{center,ci}$ is the center individual of clan $ci$, and $\beta$ is in the range [0, 1]. In the $d$-th dimension, the center can be calculated as follows:

$$x_{center,ci,d} = \frac{1}{n_{ci}} \sum_{j=1}^{n_{ci}} x_{ci,j,d} \tag{18}$$

where $x_{ci,j,d}$ represents the $d$-th dimension of elephant individual $x_{ci,j}$, and $n_{ci}$ indicates the number of elephants in clan $ci$.
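The clan-updating operator of Equations (16) and (17) can be sketched as below. This is an illustrative sketch for a minimization problem, not the authors' code: the function names, the default scale factors `a=0.5` and `beta=0.1`, and the `fitness` callback are assumptions for the example.

```python
import random

def clan_center(clan):
    """Center of the clan: per-dimension mean over its elephants."""
    n = len(clan)
    return [sum(column) / n for column in zip(*clan)]

def update_clan(clan, fitness, rng, a=0.5, beta=0.1):
    """Move every elephant toward the matriarch; move the matriarch to beta * center."""
    # The matriarch is the elephant with the best (lowest) fitness.
    best_idx = min(range(len(clan)), key=lambda j: fitness(clan[j]))
    best = clan[best_idx]
    center = clan_center(clan)
    updated = []
    for j, x in enumerate(clan):
        if j == best_idx:
            # Equation (17): the best elephant is placed relative to the clan center.
            updated.append([beta * c for c in center])
        else:
            # Equation (16): step toward the matriarch, scaled by a and a random r.
            r = rng.random()
            updated.append([xi + a * (bi - xi) * r for xi, bi in zip(x, best)])
    return updated

# Usage with a toy fitness (sum of squares) and a seeded generator.
rng = random.Random(7)
clan = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
new_clan = update_clan(clan, lambda x: sum(v * v for v in x), rng)
```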

Separating Operator
Male elephants leave their family group and live separately. In the optimization, this behavior is modeled by replacing the elephant with the worst fitness in each clan at every generation. The behavior can be represented as follows:

$$x_{worst,ci} = x_{min} + \left(x_{max} - x_{min} + 1\right) \times rand \tag{19}$$

where $x_{max}$ and $x_{min}$ are the upper and lower bounds of the individual, $x_{worst,ci}$ indicates the worst individual in clan $ci$, and $rand$ is a random number between 0 and 1.
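A minimal sketch of the separating operator for a minimization problem is given below. It follows the standard EHO form, in which the worst elephant is re-sampled between the bounds. The function name and the `fitness` callback are assumptions for the example.

```python
import random

def separate_worst(clan, fitness, x_min, x_max, rng):
    """Replace the worst elephant by x_min + (x_max - x_min + 1) * rand, per dimension."""
    # The worst elephant has the highest fitness value when minimizing.
    worst_idx = max(range(len(clan)), key=lambda j: fitness(clan[j]))
    new_clan = [list(x) for x in clan]
    new_clan[worst_idx] = [lo + (hi - lo + 1) * rng.random()
                           for lo, hi in zip(x_min, x_max)]
    return new_clan

# Usage with a toy fitness (sum of squares) and a seeded generator.
rng = random.Random(3)
clan = [[0.0, 0.0], [5.0, 5.0]]
new_clan = separate_worst(clan, lambda x: sum(v * v for v in x),
                          [0.0, 0.0], [10.0, 10.0], rng)
```

Because of the `+ 1` term, a re-sampled coordinate can slightly exceed the upper bound; whether to clamp it afterwards is a design choice that the text does not specify.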

Proposed Swarm Intelligence for Data Replication
This section describes the proposed strategy for selecting and placing replicas across nodes in cloud computing environments. For the proposed technique, the shortest path, bandwidth, time, cost, and distance were calculated based on the internet of things via fog computing. The proposed strategy was tested using iFogSim.

Cost and Time of Replication
Cost is a major factor for users who request replicas from different geographical locations. The cost varies from one user to another according to the proposed system, the different infrastructure, and the budget of each user. The cost equation uses the following notation: DT i is the cost of the data set; dt y z is the data replica in the region; x y z is a binary decision variable, with q ∈ (1, 2, 3, ..., l); p y z is the price of a replica; and b y z is the network bandwidth between replicas in the region.

Shortest Paths Problem (SPP) between Nodes Based on the Floyd
This study addresses the problem of selecting and placing dynamic data replicas across geographically dispersed nodes over the shortest and best channel in terms of data transmission and bandwidth [40]. In fog computing, the Floyd algorithm finds the shortest path between nodes. Applying the Floyd algorithm yields the weighted length of the shortest path between each pair of DCs. In the weight matrix m, the entry $a_{i,j}$ is the weight of the path from node i to node j. The state transition equation is as follows:

$$map[i,j] = \min\left(map[i,k] + map[k,j],\; map[i,j]\right)$$

where $map[i,j]$ denotes the shortest distance from i to j, and k is the intermediate node (breakpoint) enumerated between i and j.
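The state transition described above is the core of the Floyd (Floyd-Warshall) algorithm. A minimal Python sketch follows. The four-node link-cost matrix is a made-up example rather than data from the experiments, and `math.inf` marks pairs of DCs without a direct link.

```python
import math

def floyd_warshall(weights):
    """Return the all-pairs shortest-distance matrix for a weighted adjacency matrix."""
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is left untouched
    for k in range(n):                  # k is the intermediate node (breakpoint)
        for i in range(n):
            for j in range(n):
                # State transition: keep the cheaper of i->j and i->k->j.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

INF = math.inf
# Hypothetical link costs between four DCs.
links = [
    [0, 4, INF, 10],
    [4, 0, 3, INF],
    [INF, 3, 0, 2],
    [10, INF, 2, 0],
]
shortest = floyd_warshall(links)
```

In this example the direct link between nodes 0 and 3 costs 10, while the route through nodes 1 and 2 costs 4 + 3 + 2 = 9, so `shortest[0][3]` is 9. The triple loop gives a running time cubic in the number of nodes.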

Popularity Degree of the Data File
Users who access a file frequently, especially recently, determine its popularity. A file that has been located, replicated, and placed between DCs is one that has recently gained much popularity among users. Each file's replication factor (RF i) is calculated based on the popularity degree as in Equation (25). The dynamic threshold (TH) value is calculated as in Equation (23).

The notation is as follows: PD i is the popularity degree; an i is the number of accesses; w i is the time-based forgetting factor; RF i is the replica factor; RN i is the number of replicas; and FS i is the size of the data file.
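As a rough illustration of how these quantities could interact, the sketch below assumes that the popularity degree is the sum of access counts per time window, each scaled by its forgetting factor, and that the replica factor divides the popularity degree by the number of replicas times the file size. Both formulas are assumptions inferred from the notation (they are not reproduced from the text), and the function names are hypothetical.

```python
def popularity_degree(access_counts, forgetting_factors):
    """Assumed form: PD_i = sum over time windows of an_i * w_i."""
    return sum(an * w for an, w in zip(access_counts, forgetting_factors))

def replica_factor(pd, replica_count, file_size):
    """Assumed form: RF_i = PD_i / (RN_i * FS_i)."""
    return pd / (replica_count * file_size)

# Usage: two time windows, the more recent one weighted fully (hypothetical values).
pd = popularity_degree([10, 20], [0.5, 1.0])
rf = replica_factor(pd, replica_count=2, file_size=64)
```

Under these assumptions, a file with many recent accesses and few replicas receives a high replica factor, which makes it a candidate for further replication when compared against the threshold TH.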

System-Level Availability
The system's overall high availability is known as system byte effective rate (SBER). Tasks for data replication should allow users access to all files. Access to the most popular files is made possible by regular user access. SBER maintains the file's popularity and accessibility throughout the entire system. An illustration of the equation is as follows:

Placement of New Replicas
It places a dynamic data replica between nodes to choose the shortest possible distances. The best minimum path and the least expensive option for consumers are considered while placing data replication across DCs. Additionally, it can be shown as [28][29][30]:
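One way to picture the placement step is as a search for the candidate node with the lowest total access cost. The sketch below is a hypothetical simplification: it assumes a precomputed shortest-distance matrix `dist` between nodes and a `demand` map from user node to access count, and it scores each candidate by the access-weighted sum of distances. None of these names or the scoring rule are taken from the text.

```python
def best_placement(dist, demand, candidates):
    """Pick the candidate node that minimizes the access-weighted distance to users."""
    def total_cost(node):
        return sum(count * dist[user][node] for user, count in demand.items())
    return min(candidates, key=total_cost)

# Hypothetical shortest-distance matrix between four nodes.
dist = [
    [0, 4, 7, 9],
    [4, 0, 3, 5],
    [7, 3, 0, 2],
    [9, 5, 2, 0],
]
# Node 0 issues 1 request and node 3 issues 5 requests for the same file.
chosen = best_placement(dist, {0: 1, 3: 5}, candidates=[1, 2])
```

Here node 2 is selected, because its weighted cost (7 + 5 × 2 = 17) is lower than that of node 1 (4 + 5 × 5 = 29).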

Computational Complexity
The time complexity of the proposed AOEHO strategy is calculated from the tasks of the IoT application, the number of data replicas, the number of nodes, and the combination of AO with EHO. Suppose N represents the size of the population, D represents the number of dimensions, T represents the number of tasks, and C represents the cost. The EHO algorithm has a calculated complexity of O(T(D*N + C*N)). For the algorithm phase of the AOEHO strategy, the time complexity is O(N). Hence, the AOEHO total time complexity is O(N*T*C) and O(N). The main procedure of the proposed method is given in Algorithm 1.

Algorithm 1
Step 1: Expanded exploration (X1)
    IF (Fitness(X1(t + 1)) < Fitness(X(t)))
        X(t) = X1(t + 1)
        IF (Fitness(X1(t + 1)) < Fitness(Xbest(t)))
            Xbest(t) = X1(t + 1)
        ENDIF
    ENDIF
ELSE
    Update the current solution using Equation (3).
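The acceptance test that appears in the procedure (keep a candidate only if it improves on the current solution, then check it against the best solution) can be sketched as a small helper. This is an illustrative sketch for a minimization problem; the function name and the `fitness` callback are assumptions.

```python
def accept_if_better(candidate, current, best, fitness):
    """Greedy selection: the candidate replaces the current and best solutions only if fitter."""
    if fitness(candidate) < fitness(current):
        current = candidate
        if fitness(candidate) < fitness(best):
            best = candidate
    return current, best

# Usage with a toy fitness (sum of squares).
def cost(x):
    return sum(v * v for v in x)

current, best = accept_if_better([1.0, 1.0], [3.0, 0.0], [2.0, 0.0], cost)
```

In the usage example the candidate has cost 2, which beats both the current solution (cost 9) and the best solution (cost 4), so it replaces both.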

Configuration Details
The proposed system has been implemented on iFogSim. AOEHO selects and coordinates the placement of dynamic data replicas between fog nodes. In this section, we discuss the configuration and the fog cloud for the proposed system. The parameter settings are given in Table 2 [42][43][44].

Different Scenarios of Data Replica Size
We created different experimental scenarios covering the selection and placement of data replicas, optimal user access, reduced waiting time, and reduced bandwidth. The proposed strategy was compared with three other strategies (MCS, FFRPP, and NSGAII-DRP) to evaluate its effectiveness. The proposed strategy adapts dynamically to user access requests on IoT devices and chooses the best and most appropriate way to select and place replicas over the IoT on cloud computing.

First Scenario of Tasks
Users submit a set of tasks to optimize data replication under different scenarios. The first scenario, shown in Figures 2-4, uses different data replica sizes, such as 64 and 320 MB. The number of tasks ranges from 10 to 5000, and the access cost is calculated for each task. The strategy considers the availability of data and the access time of each replica, determines the replica that enjoys high popularity, selects it, and places it in the path of users. The proposed strategy outperformed other strategies in terms of the cost incurred by users.

Second Scenario of Response Time for Tasks
Figure 5 shows the data access time with the number of tasks ranging from 1000 to 5000 and file sizes of 64 or 320 MB. Moving files from remote geographical locations to locations close to users reduces waiting time and speeds up access to the optimal replica. The proposed strategy outperformed other strategies in reducing waiting time for users.

Figure 6 shows the number of replicas that are identified and placed in the path of the two users in the least time, to achieve optimal and frequent access to these files from different geographical locations. The optimal nodes are determined according to the proposed strategy, and the most popular files are placed on them to reduce users' waiting time. The proposed strategy outperformed other strategies in reducing waiting time for users.

Third Scenario of Execution Time
Figure 7 shows the execution speed of accessing and selecting data replicas and placing the most popular files across nodes in cloud computing. A scenario with 64 and 320 MB files was generated to select and place the optimal replicas across cloud computing nodes. The proposed strategy outperformed other strategies in reducing waiting time for users.

We considered the number of replicas and moved them across nodes in cloud computing along the path with the lowest length and cost. We transferred data across from x to 100 nodes and examined the effect of transferring data across nodes in cloud computing. In practice, the greater the number of nodes, the more significant the improvement achieved by the proposed strategy and the better its prediction of different ways to achieve the lowest path and cost. The proposed strategy outperformed other strategies in reducing waiting time for users.

Performance Evaluation
Degree of Balancing
Figure 14 shows the degree of imbalance over the network fog nodes when several different tasks are performed simultaneously. The proposed model lowers the degree of imbalance to a minimum level. The proposed strategy outperformed other strategies in reducing the degree of imbalance.

Figure 15 shows the data loss rate across nodes in cloud computing. When data are transferred across 50 nodes, the loss rate reaches 0 in the proposed system. The proposed strategy outperformed other strategies in reducing the data loss rate.

Throughput Time
The proposed strategy aims at adequate access to data, optimal use of resources, and improved productivity. It places data across nodes in cloud computing and improves the data transfer rate across the proposed system. It achieves significant resource utilization as well as time and cost savings across nodes in cloud computing. It can also reduce congestion across the network in cloud computing. The proposed strategy outperformed other strategies in load balancing and throughput, as shown in Figure 17.

Conclusions and Future Work
Cloud computing works with the internet of things to move data, achieve availability, and improve data access. In this research, we created a hybrid method called the AOEHO strategy to find the optimal and least expensive path and to determine the best replication via cloud computing. The aquila optimizer (AO) algorithm was combined with elephant herding optimization (EHO) to solve dynamic data replication problems in the fog computing environment. Additionally, a set of objectives was used to improve the balance between nodes and costs across cloud computing. At the same time, the proposed AOEHO strategy finds the most popular files and chooses the best location for them on the nodes closest to the users. The proposed strategy also reduces user response time and waiting time. Floyd's algorithm was used to find the shortest path for selecting and placing replicas across nodes in cloud computing. The proposed AOEHO strategy is superior to other strategies regarding bandwidth, distance, load balancing, data transmission, and the least cost path. The proposed algorithm was simulated and evaluated via iFogSim. Given its efficiency, the AOEHO strategy can be applied in future work to more optimization problems in real-world implementations, including tasks in data replication, data transmission, routing data replication problems, healthcare, energy optimization problems, and multi-hop data transfer between nodes.
